ABSTRACT

Current IT systems promise to reveal connections between apparently harmless and
unrelated pieces of information. An article in the New York Times from February 2006 [2] makes clear that
common data mining techniques were not successful in general. Despite huge investments, correlating
data from different sources did not yield satisfactory results. Transforming low-level data by aggregation
into meaningful events is nevertheless the key to building the basis for subsequent decisions in the context of
situation reports.

A more realistic and manageable approach combines interaction with the user and domain-specific
knowledge. Security-relevant information should be derived through an iterative, multi-level
process. This process is the core element of intelligence analysis systems, which play an important
role in supporting decisions in management information systems [3].

The following example illustrates the basic automated process for discovering communication
structures in the context of radio reconnaissance. A crucial part of this process is the analysis and
visualisation of communication structures or, more generally, of network information. This should be
embedded in spatio-temporal data analysis with geo-oriented data access and the integration of
domain-specific analysis functions.


Figure  1:  Domain-specific  Analysis  Function 


Figure  2:  Spatial  Access 
ORGANIZATION


The intelligent analysis of radio emission data is based on data mining techniques, cluster visualisations
to validate the results, model-based communication detection (including domain-specific knowledge),
and the visualisation of communications. The following use case of a simple simplex communication
clarifies the problems and the applied methods. The modules are coupled through a distributed architecture.
The input is a very large number of arbitrarily distributed radio emissions. Each emission is described
by the attributes ID, frequency, modulation type, start time, end time, latitude and longitude. The
data quality of a single emission depends on propagation conditions. Because these
vary, single emissions or attributes may be missing or, conversely, information at different
classification levels may be available. Furthermore, with broadband collection of emissions the
amount of information is extremely large and requires massive data handling that cannot be done
in main memory.
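
The emission record described above can be sketched as a small data structure. This is only an illustration: the field names, the CSV input format and the helper `read_emissions` are assumptions and not part of the system described here. The sketch shows two points from the text: every attribute except the ID may be missing, and records are read one at a time rather than loaded into main memory.

```python
import csv
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Emission:
    """One radio emission; any attribute except the ID may be missing."""
    id: str
    frequency: Optional[float] = None   # unit (Hz) is an assumption
    mod_type: Optional[str] = None
    start: Optional[float] = None       # seconds, hypothetical time base
    end: Optional[float] = None
    lat: Optional[float] = None
    lon: Optional[float] = None


def _num(text: Optional[str]) -> Optional[float]:
    """Parse a float, mapping empty or absent fields to None."""
    text = (text or "").strip()
    return float(text) if text else None


def read_emissions(lines: Iterable[str]) -> Iterator[Emission]:
    """Yield emissions one by one so the full data set never sits in memory."""
    for row in csv.DictReader(lines):
        yield Emission(
            id=row["id"],
            frequency=_num(row.get("frequency")),
            mod_type=(row.get("mod_type") or "").strip() or None,
            start=_num(row.get("start")),
            end=_num(row.get("end")),
            lat=_num(row.get("lat")),
            lon=_num(row.get("lon")),
        )
```

Because `read_emissions` is a generator, it can be fed an open file object and consumed by a downstream clustering step without materialising the whole collection.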

1.0  USE-CASE  SIMPLEX-COMMUNICATION 

The use case searches for a simplex communication chain with two stationary partners: a central station
and a substation. Both use the same constant nominal frequency and the same transmission mode.
The partners transmit alternately, one after the other. The difficulty lies in the number of possible
instances of the communication structure. Although the communication is easy to describe informally,
an exact, formal specification is needed for computer-supported
analysis. It should not be realised as a specific static algorithm; the user should be able to change it
interactively and exploratively. The core concept comprises the following steps:

1.1  Data  Mining 

In the first step emissions are assigned to clusters. A cluster groups emissions according to spatial,
temporal or frequency criteria. In this way a significant data reduction is achieved. Spatial clustering
can identify individual emitter stations. Besides the processing of extremely large amounts of data, the
main problem is choosing the best method and parameters.
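
As a rough illustration of spatial clustering, the sketch below groups emissions into latitude/longitude grid cells. Grid binning is chosen here only because it is simple and runs in one pass; the text does not say which clustering method is actually used, and the function name, the cell size `cell_deg` and the threshold `min_size` are hypothetical.

```python
from collections import defaultdict
from math import floor
from typing import Dict, List, Tuple

Point = Tuple[str, float, float]  # (emission id, latitude, longitude)


def grid_clusters(points: List[Point], cell_deg: float = 0.01,
                  min_size: int = 2) -> Dict[Tuple[int, int], List[str]]:
    """Group emissions whose positions fall into the same lat/lon grid cell."""
    cells: Dict[Tuple[int, int], List[str]] = defaultdict(list)
    for eid, lat, lon in points:
        cells[(floor(lat / cell_deg), floor(lon / cell_deg))].append(eid)
    # Sparsely populated cells are treated as noise, not as emitter stations.
    return {cell: ids for cell, ids in cells.items() if len(ids) >= min_size}
```

The choice of `cell_deg` and `min_size` is exactly the parameter problem mentioned above: a cell boundary can split one station into two clusters, and too large a cell merges neighbouring stations.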

1.2  Cluster  Visualisation 

The next step validates the data mining results and already allows the user to
discover communication structures manually from the presented visualisation, for
example the presentation of spatial clusters. Emissions can appear as single instances or as temporally
ordered parts of a cluster. It is difficult to arrange the emissions and clusters clearly enough to
focus on the data that is actually of interest. In addition, different attributes have to be integrated.


1.3  Model based communication detection

Computing communication structures from clusters
is the next step. This is done using typical
communication models. A domain-specific modelling
language is used to represent the
communication models. In this language the simplex
communication can be formally specified. The model
distinguishes between the connection phase and
alternating communications.


Figure  3:  Simplex  Communication  Rules 
Figure 4: Simplex Communication Instance


The connection phase consists of three
emissions: the central station sends, the substation
replies, and the central station sends again. The
alternating sequence consists of emissions
from the central station and the substation in turn. All emissions
have the same frequency and modulation type. The
gap between consecutive emissions is bounded by a delay
parameter. A graphical notation of such a model is
shown in Figures 3 and 4.
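
The rules just described can be sketched as a check on one candidate chain of emissions. This is a minimal illustration under stated assumptions, not the domain-specific modelling language itself: the class `StationEmission`, the function `is_simplex_chain` and the default `max_delay` of 10 seconds are hypothetical, and each emission is assumed to be already labelled with its station ("H" for the central station, "N" for the substation) by the clustering step.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StationEmission:
    station: str      # "H" = central station, "N" = substation (assumed labels)
    start: float      # seconds
    end: float
    frequency: float
    mod_type: str


def is_simplex_chain(emissions: List[StationEmission],
                     max_delay: float = 10.0) -> bool:
    """Check one candidate chain against the simplex rules described above."""
    chain = sorted(emissions, key=lambda e: e.start)
    # The connection phase needs three emissions and starts at the central station.
    if len(chain) < 3 or chain[0].station != "H":
        return False
    for prev, cur in zip(chain, chain[1:]):
        same_channel = (cur.frequency == prev.frequency
                        and cur.mod_type == prev.mod_type)
        alternates = cur.station != prev.station
        gap = cur.start - prev.end
        # Each emission starts strictly after the previous one ends,
        # and no later than max_delay seconds afterwards.
        if not (same_channel and alternates and 0 < gap <= max_delay):
            return False
    return True
```

A real detector would also have to enumerate candidate chains from the clusters; this sketch only validates a chain once it has been proposed, which keeps the delay parameter and the rule set easy to change interactively.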


1.4  Visualisation  of  discovered  communications

This step presents the discovered
communications and thereby allows the
model-based communication detection to be validated. It has to deal
with many composite events. A simple textual
visualisation does not meet these needs. A graphical
visualisation offers a better overview and a wide range of
interaction possibilities.

The emitters are placed in the spatial plane, and time is the third
dimension. The connecting lines indicate the
communications.

Figure 5: Discovered Network Communications


Coping with massive data streams for intelligence tasks is a challenge that requires an analysis
process with seamless data access and the involvement of the intelligence analyst. The acceptance of the
results depends on the possibility of validating them. The sustainability of the results has to be
guaranteed by flexible extension of the current domain-specific analysis methods.
Agenda


■  Motivation 

■  Basic  Concept 

■  Exploration  Process 

■  Summary 


Motivation 


■  ACOS delivers a huge amount of data

■  Intelligence requires adding value

■  Task: Transforming low-level data by aggregation into
meaningful events is the key to building the basis for
subsequent decisions in the context of situation
reports.

■  Problem:  “Common  data  mining  techniques  were  not 
successful  in  general.  Despite  huge  investments, 
correlating  data  from  different  sources  did  not  yield 
satisfactory  results.” 

(New  York  Times  in  February  2006,  Taking  Spying  to  Higher  Level,  ...) 

A domain-specific approach is necessary


Motivation 


■  Solution:  KDD  -  Overcoming  massive  data 
streams  for  intelligence  tasks 

■  Domain-specific approach: based on reconnaissance know-how

■  More  realistic  and  manageable:  interactions  with  the 
user  along  with  domain  specific  knowledge. 

■  Security-relevant information should be derived
through an iterative, multi-level process. This process
is the core element of intelligence analysis
systems, which play an important role in supporting
decisions in management information systems.
ICAP  focus  of  interest


■  ACOS provides for the first time:

■  complete broadband coverage

■  all emissions on air are collected and stored in a database

■  How can this information be exploited?

■  What are possible interests of our customers?

■  Does the data contain recurrent patterns?

■  spatio-temporal 

■  communication  profiles 

■  Is there a deviation from normal behaviour?

■  frequency change

■  new net members

■  Is there an indication that expected
events have failed to occur?

■  periodicity 

■  member  does  not  communicate 
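
The last question, whether an expected periodic event failed to occur, can be illustrated with a small sketch. It assumes that the period of the emitter is already known and that emission start times are given in seconds; the function name `missing_slots` and the `tolerance` parameter are hypothetical.

```python
from typing import List


def missing_slots(times: List[float], period: float,
                  tolerance: float) -> List[float]:
    """Return expected emission times at which no emission was observed."""
    if period <= 0:
        raise ValueError("period must be positive")
    if not times:
        return []
    observed = sorted(times)
    first, last = observed[0], observed[-1]
    missing = []
    k = 1
    # Walk the expected schedule between the first and last observation.
    while first + k * period < last:
        expected = first + k * period
        if not any(abs(t - expected) <= tolerance for t in observed):
            missing.append(expected)
        k += 1
    return missing
```

For example, with emissions at 0, 60, 120, 240 and 300 seconds and a period of 60 seconds, the slot at 180 seconds is reported as missing.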
ICAP  Base  Components


Information processing aims at adding value to information
Reduction to statements guides the development
Important topics

■  Visualisation,  Tracking/GIS 

■  DM/KDD/OLAP 

■  Data  Warehouse 
Discovery  Goals


•  Discovery of stationary emitters

•  Discovery of mobile emitters

•  Discovery of simple command structures
Anomaly


[Figure: a missing emission relative to the model]
Exploration  Process


1.  data mining: emissions -> cluster (e.g. spatially combined emissions)

2.  cluster visualisation

3.  model based detection: cluster -> communication structures

4.  visualisation of communication structures
data  mining


■  clustering emission data:

■  location 

■  frequency 

■  time 

■  data  reduction 

■  determination  of  emitters 

■  spatial  clustering 

■  timeseries  of  emissions 


cluster  visualisation 
model  based  event  detection


■  modeling  of  typical  communication  structures 

■  domain  specific  modeling  language 

■  representation  of  communication  models 

■  combination  of  model  components 

■  event  detection 


■  translation of models into constraints

■  combinatorial analysis of possible model interpretations


example


[Figure: timeline per emitter showing start and end of the emissions H1, H2, H3 (central station) and N1, N2 (sub station); the gaps between consecutive emissions are each < 10 s]

Domain specific model based
communication detection


central station: H

sub station: N

connection phase
(3 emissions: H1, N1, H2)

H1.end < N1.start;

(N1.start - H1.end) <= 10

N1.end < H2.start;

(H2.start - N1.end) <= 10

alternating communications
(every 2 emissions: H3, N2)

N2.end < H3.start;

(H3.start - N2.end) <= 10
model  based  event  detection


[Figure: simplex communication instance, with a connecting phase followed by alternating communication between the central station (Cluster 20) and the sub station (Cluster 790); ACOS emission ids 103413792, 103414077, 103415371, 103413878, 103414976; time axis from 6:00:00 to 6:00:16; frequency: 9991000.0, ModType: MFSK]
visualisation  of  a
communication  structure

example:  frequency  change


We have: communications in an interesting time and
space window with possibly known frequency
ranges

We are looking for: changes in frequency usage for the
interesting communications
visualisation  of  a
communication  structure


example:  command  structures

We have: known fixed emitter

We are looking for: communication partners building a special
structure, e.g. ABACADABACAD...
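
A pattern such as ABACADABACAD... can be read as a known fixed emitter (A) taking every second turn while several partners answer in between. The sketch below checks a sequence of emitter labels against that reading. It is only one possible interpretation of the example; the function name and the `min_partners` threshold are hypothetical.

```python
from typing import Sequence


def follows_star_pattern(sequence: Sequence[str], hub: str,
                         min_partners: int = 2) -> bool:
    """True if the hub sends on every even turn and only partners in between."""
    if not sequence or sequence[0] != hub:
        return False
    for i, emitter in enumerate(sequence):
        hub_turn = (i % 2 == 0)
        if hub_turn != (emitter == hub):
            return False
    # Several distinct partners distinguish a command structure
    # from a plain two-party simplex communication.
    return len(set(sequence[1::2])) >= min_partners
```

With the hub given as "A", the sequence "ABACADABACAD" matches, while "ABABAB" does not because it has only one partner.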


visualisation  of  a 
communication  structure 


example:  free  search


We  have:  spatio-temporal  data  selection 

We  are  looking  for:  two  partners  communicating  in  simplex  mode 
visualisation  of  a
communication  structure


example:  increasing  communication  activity

[Figure: emission activity at 6 am and at 4 pm]

At different times the intensity of communication may change heavily.

We have: an area of special interest

We are looking for: changes in emission occurrence


Architecture 
Cooperative  Research


■  Cooperation with a computer science research
institute; competence fields:

■  distributed systems

■  software engineering

■  intelligent systems

■  learning with new media

■  logistics simulations

■  usability and software ergonomics

■  IT security

■  visualisation and interactive media

■  Incorporation of the newest and best
research results

■  Know-how and innovation transfer from
research into industry


Benefit 


The  COPIN  approach  makes  massive  emission  data 
streams  manageable  and  analysable 

■  Domain  specific  orientation  integrates  the  expertise  of  the 
intelligence  analysts 

■  We  aim  at  meeting  the  interests  of  our  customers 

■  COPIN  provides  an  exploration  process  to  successfully 
add  value  to  information: 

1.  data  mining 

2.  cluster  visualisation 

3.  model  based  communication  detection 

4.  visualisation  of  communication  structures 

■  The  model  based  concepts  represent  a  flexible 
independent  way  to  work  with  your  data 

■  The COPIN concept is also applicable to other data
collections.